An organism-specific method to rank predicted coding regions in Trypanosoma brucei.

Authors

  • Shuba Gopal
  • George A M Cross
  • Terry Gaasterland
Abstract

Genome annotation in differently evolved organisms presents challenges because the lack of sequence-based homology limits the ability to determine the function of putative coding regions. To provide an alternative to annotation by sequence homology, we developed a method that takes advantage of unusual trypanosomatid biology and skews in nucleotide composition between coding regions and upstream regions to rank putative open reading frames based on the likelihood of coding. The method is 93% accurate when tested on known genes. We have applied our method to the full complement of open reading frames on Chromosome I of Trypanosoma brucei, and we can predict with high confidence that 226 putative coding regions are likely to be functional. Methods such as the one described here for discriminating true coding regions are critical for genome annotation when other sources of evidence for function are limited.
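The abstract does not say which composition statistics the method uses or how they are combined. As a toy illustration only, the sketch below ranks candidate open reading frames by one assumed skew measure: the GC fraction of the ORF minus the GC fraction of its upstream region. The `Candidate` type, the function names, and the example sequences are all hypothetical and are not taken from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A putative ORF together with the sequence immediately upstream of it."""
    name: str
    orf: str
    upstream: str


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in seq; 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def composition_skew(candidate: Candidate) -> float:
    """Assumed score: GC fraction of the ORF minus that of its upstream region."""
    return gc_fraction(candidate.orf) - gc_fraction(candidate.upstream)


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Sort candidates from largest to smallest skew."""
    return sorted(candidates, key=composition_skew, reverse=True)


# Made-up sequences, for illustration only.
candidates = [
    Candidate("orf_a", "ATGGCCGCGGGCTGA", "ATATATTTTA"),
    Candidate("orf_b", "ATGAAATTTTAA", "GCGCGGCCGC"),
    Candidate("orf_c", "ATGGCATTAGCCTAA", "ATTGCATTAA"),
]
ranked = rank_candidates(candidates)
for c in ranked:
    print(f"{c.name}\t{composition_skew(c):+.3f}")
```

With these made-up inputs, `orf_a` ranks first because its ORF is GC-rich while its upstream region is not. A realistic scorer would need statistics calibrated on known genes, as the abstract's accuracy figure implies.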

Related resources

Depletion of the RNA-Binding Protein RBP33 Results in Increased Expression of Silenced RNA Polymerase II Transcripts in Trypanosoma brucei

We have characterized the RNA-binding protein RBP33 in Trypanosoma brucei, and found that it localizes to the nucleus and is essential for viability. The subset of RNAs bound to RBP33 was determined by immunoprecipitation of ribonucleoprotein complexes followed by deep sequencing. Most RBP33-bound transcripts are predicted to be non-coding. Among these, over one-third are located close to the e...

microRNA Detection and Target Prediction in Trypanosoma brucei

INTRODUCTION: MicroRNAs (miRNAs) constitute an abundant family of tiny non-coding RNAs (ncRNAs), about 19-25 nucleotides (nt) in length, encoded in the genomes of many eukaryotes from nematodes to humans (1). Computer-based approaches for miRNA gene identification and target prediction are considered indispensable in miRNA research. We have developed a homology based computational approa...

The Genome Sequence of Trypanosoma brucei gambiense, Causative Agent of Chronic Human African Trypanosomiasis

BACKGROUND: Trypanosoma brucei gambiense is the causative agent of chronic Human African Trypanosomiasis or sleeping sickness, a disease endemic across often poor and rural areas of Western and Central Africa. We have previously published the genome sequence of a T. b. brucei isolate, and have now employed a comparative genomics approach to understand the scale of genomic variation between T. b....

Characterization of Trypanozoon isolates using a repeated coding sequence and microsatellite markers.

Genetic variation of microsatellite loci is a widely used method for linkage analysis, individual identification or inter-population studies. Here we analyse a repeated DNA coding sequence and eleven new microsatellites identified within the Trypanosoma (Trypanozoon) brucei genome. Ninety-seven isolates belonging to the five species and subspecies Trypanosoma evansi, T. equiperdum, T. brucei br...

Comparative Genomics Reveals Multiple Genetic Backgrounds of Human Pathogenicity in the Trypanosoma brucei Complex

The Trypanosoma brucei complex contains a number of subspecies with exceptionally variable life histories, including zoonotic subspecies, which are causative agents of human African trypanosomiasis (HAT) in sub-Saharan Africa. Paradoxically, genomic variation between taxa is extremely low. We analyzed the whole-genome sequences of 39 isolates across the T. brucei complex from diverse hosts and ...

Journal:
  • Nucleic Acids Research

Volume 31, Issue 20

Pages: -

Publication date: 2003